Analysing English Noun Groups
for  their  Conceptual  Content 


Anatole  V.  Gershman 
Department  of  Computer  Science 
Yale  University 
New  Haven,  Connecticut  06520 


Abstract

An expectation-based system, NGP, for parsing English noun
groups into the Conceptual Dependency representation is
described. The system is a part of ELI (English Language
Interpreter), which is used as the front end to several natural
language understanding systems and is capable of handling a wide
range of sentences of considerable complexity. NGP processes the
input from left to right, one word at a time, using linguistic
and world knowledge to find the meaning of a noun group.
Dictionary entries for individual words contain much of the
program's knowledge. In addition, a limited ability to handle
slightly incorrect sentences and unknown words is incorporated.


0.  Introduction 

Every  natural  language  processor  has  to  have  the  ability  to 
interpret  noun  phrases.  This  paper  describes  a set  of  programs 
called  NGP  (Noun  Group  Processor)  which  is  an  integral  part  of 
ELI,  the  English  Language  Interpreter  (Riesbeck  and  Schank  1976) 
which  serves  as  the  front  end  to  three  of  the  Yale  natural 
language  understanding  systems,  SAM,  PAM  and  WEIS.  SAM  is  a 
system  capable  of  understanding  stories  such  as  various  newspaper 
reports by using scripts (Schank and Abelson 1975, 1977;
Cullingford 1975, 1977). PAM is an understanding system which
uses general knowledge about people's goals and plans (Wilensky
1976). WEIS is a system which understands and classifies a great
variety of isolated newspaper headlines on international
relations.  Thus,  our  task  was  to  process  not  only  noun  phrases 
of  considerable  complexity  but  also  to  interpret  newspaper 
headlines  which  are  not  always  grammatically  correct.  The 
following  two  examples  illustrate  the  kind  of  sentences  our 
system  is  able  to  handle. 

1. A CONNECTICUT MAN, JOHN DOE, AGE 23, OF 234 COLLEGE AVENUE,
NEW HAVEN WAS PRONOUNCED DEAD AT THE SCENE BY DR. DANA
BLAUCHARD, MEDICAL EXAMINER.

2.  FUNERAL  OF  INDIA'S  SHASTRI  ATTENDED  BY  USSR  KOSYGIN  AND  USA 
HUMPHREY. 

To process such a wide range of sentences the program makes
extensive use of its knowledge of the problem domain and of the
redundancy of natural language expressions. This saves effort
and permits correct processing of such irregularities of input
texts as missing commas and articles or slightly incorrect word
order. It also makes it possible to ignore unknown words or (in
some cases) to make plausible interpretations of them. This
knowledge is kept in the dictionary. The control mechanisms
remain domain independent.

NGP  is  a production-like  system  which  uses  expectations  as 
its  basic  control  mechanism.  The  problem  with  every 
production-like  system  is  the  tendency  for  the  accumulation  of  a 
large  number  of  expectations  fighting  for  a chance  to  be 
tested. In this work I have tried to develop a theory of how
various  expectations  are  organized  and  processed,  which,  I 
believe,  is  in  fact  a theory  of  how  people  process  natural 
language.  The  basic  guiding  principle  for  this  theory  was  its 
intuitive  plausibility. 

1.  Noun  Group  Semantics 

We  differentiate  four  classes  of  noun  groups  according  to 
the  semantic  structures  they  generate. 

1.  PP  - Picture  Producers 

2.  CTP  - Concept  Producers 

3.  TD  - Time  Descriptors 

4.  SD  - State  Descriptors 

1.1  Picture  Producers 

PP's  are  defined  by  Schank  (Schank  1975)  as  concepts  which 
tend  to  produce  pictures  of  real  world  items  in  the  mind  of  a 
hearer.  For  example, 

(1) A BIG RED APPLE

is  a Picture  Producing  noun  group.  To  understand  such  an  item 
means  to  identify  the  structure  in  the  memory  which  corresponds 
to  this  item  if  such  a structure  exists  or  to  create  one 
according  to  some  frame.  This  is  done  in  two  stages.  In  the 
first  stage,  we  analyze  the  input  phrase  and  translate  it  into  an 
expression in Conceptual Dependency (Schank 1972, 1973, 1975).
This expression should preserve in a language-independent form
all  the  information  contained  in  the  surface  phrase.  Thus  (1) 
will  generate 

(#PHYSOBJ  COLOR  (x)  SIZE  (y)  DETERM  (INDEF)), 

where x and y are points on the color and size scales. In the
second stage, we identify the CD expression with the existing
memory  structures  by  performing  the  necessary  memory  search  and 
feature  matching. 

A CD  expression  for  a PP  consists  of  a header  followed  by  a 
property  list.  The  header  is  similar  to  a superset  pointer  in 
hierarchically organized memory systems. It points to a frame of
properties that the PP is expected to have. The property list
explicitly given in the CD expression must be compatible with
this frame. Thus a (#PERSON) is expected to have FIRSTNAME,
LASTNAME, RESIDENCE, etc. but a (#PHYSOBJ) is not. All
properties  not  included  in  the  frame  must  be  specified  by  a REL 
clause.  For  example, 

(2) JOHN DOE, THE PASSENGER OF THE CAR

is represented by

(#PERSON FIRSTNAME (JOHN) LASTNAME (DOE)
         REL ((<=> ($DRIVE PASSENGER MODFOCUS)))),

where MODFOCUS is a back pointer to the focus of the REL
modifier, i.e. to (#PERSON ...).
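
The header-plus-property-list layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name PP, the FRAMES table and the #PHYSOBJ frame contents are hypothetical, and only the #PERSON properties named in the text come from the paper.

```python
from dataclasses import dataclass, field

# Hypothetical frames: the properties each header is expected to carry.
# Only the #PERSON entries are named in the text; the rest are assumptions.
FRAMES = {
    "#PERSON": {"FIRSTNAME", "LASTNAME", "RESIDENCE"},
    "#PHYSOBJ": {"TYPE", "COLOR", "SIZE"},
}

@dataclass
class PP:
    header: str                                 # superset pointer, e.g. "#PERSON"
    props: dict = field(default_factory=dict)   # property list compatible with the frame
    rel: list = field(default_factory=list)     # REL clauses for everything else

    def add(self, name, value):
        """Store a property only if the frame for this header expects it."""
        if name not in FRAMES[self.header]:
            raise ValueError(f"{name} is not in the {self.header} frame; use a REL clause")
        self.props[name] = value

# Example (2): JOHN DOE, THE PASSENGER OF THE CAR
john = PP("#PERSON")
john.add("FIRSTNAME", "JOHN")
john.add("LASTNAME", "DOE")
# Being a passenger is not a frame property, so it is recorded as a REL clause.
john.rel.append(("<=>", ("$DRIVE", "PASSENGER", "MODFOCUS")))
```

Rejecting an unexpected property, rather than silently storing it, mirrors the requirement that the property list be compatible with the frame.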

SAM's memory program accepts 7 general classes of PP's:
#PERSON, #PHYSOBJ, #ORGANIZATION, #LOCALE, #ROAD, #GROUP, and
#POLITY, which can be illustrated by the following examples:

(3) JOHN = (#PERSON FIRSTNAME (JOHN))

(4) TABLE = (#PHYSOBJ TYPE (*TABLE*))

(5) NAVY = (#ORGANIZATION BRANCH (NAVY))

(6) 593 FOXON RD = (#LOCALE STREETNUMBER (593)
                            STREETNAME (FOXON)
                            STREETTYPE (ROAD))

(7) ROUTE 69 = (#ROAD ROADNUMBER (69)
                      ROADTYPE (HIGHWAY))

(8) JOHN AND MARY = (#GROUP
                       MEMBER (#PERSON FIRSTNAME (JOHN))
                       MEMBER (#PERSON FIRSTNAME (MARY)))

(9) USA = (#POLITY TYPE (COUNTRY) NAME (USA))

1.2  Concept  Producers 

Very  often  noun  groups  do  not  describe  any  real  world  items. 
Consider  the  following  sentence: 

(12)  JOHN  VOTED  IN  THE  1976  PRESIDENTIAL  ELECTION. 

THE  1976  PRESIDENTIAL  ELECTION  does  not  produce  a single 
"picture"  in  the  mind  of  the  hearer.  Rather,  it  points  to  a 
complicated  concept  involving  the  names  of  the  candidates, 
primaries, voter registration, etc. The knowledge about typical
elections is normally organized in a script-like form. The verb
VOTED specifies the role John played in the election script.
Thus, the meaning of (12) is the invocation of the election
script and the instantiation of the script roles. The CD
representation of THE 1976 PRESIDENTIAL ELECTION produced by the
parser looks as follows:

($ELECTION TYPE (PRESIDENTIAL) TIME (1976) REF (DEF)),

where $ELECTION is a script name and TYPE and TIME are script
parameters. This output is interpreted by the Script Applier.
All script names and parameters which appear in the CD expression
must be recognizable (expected) by the Script Applier.

1.3  Time  Descriptors 

This  type  of  noun  group  can  be  illustrated  by  the  following 
example:

(13)  LAST  YEAR  WAS  BAD  FOR  JOHN. 

Sentence  (13)  means  that  something  unspecified  happened  which 
made  John  unhappy  and  that  this  event  (or  events)  occurred  during 
last  year.  LAST  YEAR  does  not  generate  a separate  concept  but 
enters as a time modifier into another concept. Other examples
of Time Descriptors are: YESTERDAY, MONDAY MORNING, THE WHOLE
DAY, etc.


1.4 State Descriptors

Noun groups of this class produce assertions about the
states of PP's. For example, the meaning of

(14) THE BEAUTY OF THE PLACE (struck John)

is "THE PLACE IS VERY HIGH ON SOME AESTHETIC SCALE", or, in
CD form:

[(ACTOR (#LOCALE REF (DEF)) IS (*AESTHETIC-SCALE* VAL (10]

Phrase  (14)  is  an  assertion  of  a fact  about  the  place  rather  than 
a PP  with  a modifier  as  in 

(15)  (I  saw)  A BEAUTIFUL  PLACE, 

which  can  be  represented  in  CD  form  as 
(#LOCALE
    REL ((ACTOR MODFOCUS IS (*AESTHETIC-SCALE* VAL (10))))
    REF (INDEF)),

i.e.  a place  which  is  very  high  on  some  aesthetic  scale. 

2.  Basic  Noun  Group  Parser 

The  goal  and  the  general  methods  of  the  Noun  Group  Parser 
(NGP)  are  identical  to  the  rest  of  ELI,  i.e.  the  goal  of  NGP  is 
the  extraction  of  the  conceptualizations  that  underlie  the  input. 
Expectations  are  its  basic  mechanisms  of  operation.  (See 
Riesbeck  and  Schank  1976).  However,  the  control  structure  and 
the  order  in  which  the  expectations  are  stored  and  tested  in  NGP 
are very different from those of ELI. To put it briefly, in ELI
all the expectations are placed in one pool and are tested
whenever a new word or concept is considered. NGP takes
advantage of the relatively rigid structure of English noun groups
to select and order suitable expectations at each point of the
process. The program examines words of the input string from
left to right. The basic loop of the analyzer consists of two
steps:

1. The dictionary definition of the current word is loaded into
the active memory.

2. Relevant expectations are selected and tested. If an
expectation is satisfied, the actions associated with it are
executed.

This basic loop is similar to the monitoring control program of
ELI  or  any  other  production-like  system.  The  difference  is  in 
the  selection  and  ordering  of  expectations.  This  process  is 
rather  complicated  and  I will  try  to  describe  it  systematically 
and  in  increasingly  greater  detail  throughout  the  rest  of  the 
paper.  I will  begin  by  presenting  the  analysis  of  a simple 
example:

(1)  LARGE  CHINESE  RESTAURANT 

First,  NGP  sees  the  word  LARGE.  The  dictionary  definition  of 
LARGE  is  a program  which  can  test  the  environment  when  LARGE  is 
brought  into  the  active  memory  and  build  the  initial  SEMANTIC 
NODE  for  it.  These  semantic  nodes  (called  NGP  nodes  in  the 
program)  are  the  construction  sites  where  various  parts  of  the 
future  CD  expression  are  being  assembled.  The  node  for  LARGE, 
say  NGP1,  has  an  expectation  attached  to  it  which  says  "if  the 
next  semantic  node  is  an  inanimate  PP  then  attach  modifier 
SIZE  (x)  to  it".  NGP1  is  saved  in  a stack  called  MODLIST. 

The word CHINESE builds the semantic node NGP2, whose
SEMANTIC VALUE is (*CHINA*) and which has an expectation saying
"if the next semantic node is a #PHYSOBJ then attach the modifier
MADEIN (*CHINA*) to it, if it is a #PERSON or an #ORGANIZATION
then attach the modifier PARTOF (*CHINA*) to it". Having done
this,  the  monitor  checks  the  expectation  attached  to  NGP1.  It 
fails  and  NGP2  is  placed  on  the  top  of  MODLIST. 

Next comes the word RESTAURANT. It builds the semantic node
NGP3 whose semantic value is (#ORGANIZATION OCCUPATION
(RESTAURANT)) and which has an expectation: "if the PREVIOUS
semantic node can be a restaurant type then attach it to the
current node". Now the monitor goes into the expectation testing
mode of operation. It sees two sets of expectations: those
attached  to  NGP2  looking  "forward"  at  NGP3  and  those  attached  to 
NGP3  looking  "backward"  at  NGP2.  Expectations  attached  to  NGP1 
are  not  considered  because  NGP1  is  hidden  by  NGP2.  First,  the 
monitor  tests  those  expectations  of  the  current  node  which  look 
"backward"  (called  BACKWARD  in  the  program).  If  there  are  no 
such  expectations  or  if  all  of  them  fail,  the  monitor  tests  the 
"forward"  expectations  (called  FORWARD  in  the  program)  attached 
to  the  previous  semantic  node.  If  an  expectation  is  satisfied, 
the  stack  is  popped  and  the  process  is  repeated  until  no 
expectations  are  satisfied.  Thus,  MODLIST  intuitively  contains 
those  modifiers  which  have  not  yet  been  attached.  The  current 
node,  which  is  kept  in  NGAP,  is  the  focus  of  assembling 
activities  at  each  step.  In  our  example  (*CHINA*)  can  be  a 
restaurant  type,  the  expectation  is  satisfied,  the  value  of  NGP3 
is modified, and NGP2 is removed from MODLIST. The following
diagram illustrates the transition:

BEFORE: MODLIST = NGP2, NGP1
        NGAP = NGP3
        NGP3 = (#ORGANIZATION OCCUPATION (RESTAURANT))

AFTER:  MODLIST = NGP1
        NGAP = NGP3
        NGP3 = (#ORGANIZATION OCCUPATION (RESTAURANT)
                              TYPE (*CHINA*))

Now  the  monitor  sees  NGP1  on  the  top  of  the  stack.  Since  NGP3 
does  not  have  any  BACKWARD  expectations  left,  the  FORWARD 
expectation of NGP1 is tested. Note that at this point, NGP3
does  not  correspond  to  any  particular  word,  but  represents  the 
combined  meaning  of  CHINESE  RESTAURANT.  LARGE  can  be  attached  to 
NGP3  and  the  resulting  structure  is: 

MODLIST = EMPTY
NGAP = NGP3
NGP3 = (#ORGANIZATION OCCUPATION (RESTAURANT)
                      TYPE (*CHINA*)
                      SIZE (x))

So  far,  we  have  introduced  the  following  concepts: 

SEMANTIC NODES - the nuclei around which all construction
    activities take place. The value of a semantic node is a
    piece of conceptual structure which might be used in
    assembling the CD expression for the whole noun group.

BACKWARD and FORWARD - two groups of expectations attached to
    a semantic node.

NGAP - holds the current semantic node.

MODLIST - a stack which holds all previous semantic nodes that
    have not yet been attached.

The basic control algorithm of NGP, which was informally
described with the help of the above example, can now be stated
in more precise terms:

STEP1 Read new word. Execute its definition and put the
resulting semantic node in NGAP.

STEP2 If MODLIST is empty then go to STEP7 else go to STEP3.

STEP3 If NGAP does not have any BACKWARD expectations go to
STEP5, otherwise go to STEP4.

STEP4 Evaluate BACKWARD expectations of NGAP. In case of failure
go to STEP5, otherwise pop the stack and go to STEP2.

STEP5 If the semantic node on the top of MODLIST does not have
any FORWARD expectations then go to STEP7, otherwise go to
STEP6.

STEP6 Evaluate FORWARD expectations. In case of failure go to
STEP7, otherwise pop the stack and go to STEP2.

STEP7 Put the content of NGAP (current semantic node) on MODLIST
and go to STEP1.
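
The seven steps can be sketched as a short Python loop and run on LARGE CHINESE RESTAURANT. This is a minimal sketch, not the original program: the Node class, the fire helper, the INANIMATE and RESTAURANT_TYPES tables and the *LARGE* value are all assumptions made for illustration. Following the walk-through above, a satisfied expectation is removed from its node. Like the algorithm itself, the sketch has no STOP statement and simply runs to the end of the input.

```python
from dataclasses import dataclass, field

# Hypothetical feature tables; the paper keeps such knowledge in the dictionary.
INANIMATE = {"#PHYSOBJ", "#ORGANIZATION"}
RESTAURANT_TYPES = {"*CHINA*"}

@dataclass
class Node:
    value: dict                                   # piece of conceptual structure
    forward: list = field(default_factory=list)   # FORWARD: look at the next node
    backward: list = field(default_factory=list)  # BACKWARD: look at the previous node

def fire(expectations, prev, cur):
    """Test expectations in order; consume and report the first satisfied one."""
    for exp in expectations:
        if exp(prev, cur):
            expectations.remove(exp)
            return True
    return False

def parse(words, dictionary):
    modlist = []                                  # MODLIST; the stack top is the last item
    for word in words:                            # STEP1
        ngap = dictionary[word]()                 # NGAP: the current semantic node
        while modlist:                            # STEP2
            prev = modlist[-1]
            # STEP3-4: BACKWARD expectations of NGAP are tested first;
            # STEP5-6: only then the FORWARD expectations of the stack top.
            if fire(ngap.backward, prev, ngap) or fire(prev.forward, prev, ngap):
                modlist.pop()
            else:
                break
        modlist.append(ngap)                      # STEP7
    return modlist

def large():
    def size(prev, cur):
        if cur.value["header"] in INANIMATE:
            cur.value["SIZE"] = "x"
            return True
        return False
    return Node({"header": "*LARGE*"}, forward=[size])

def chinese():
    def origin(prev, cur):
        header = cur.value["header"]
        if header == "#PHYSOBJ":
            cur.value["MADEIN"] = "*CHINA*"
        elif header in {"#PERSON", "#ORGANIZATION"}:
            cur.value["PARTOF"] = "*CHINA*"
        else:
            return False
        return True
    return Node({"header": "*CHINA*"}, forward=[origin])

def restaurant():
    def kind(prev, cur):
        if prev.value["header"] in RESTAURANT_TYPES:
            cur.value["TYPE"] = prev.value["header"]
            return True
        return False
    return Node({"header": "#ORGANIZATION", "OCCUPATION": "RESTAURANT"}, backward=[kind])

DICTIONARY = {"LARGE": large, "CHINESE": chinese, "RESTAURANT": restaurant}
result = parse(["LARGE", "CHINESE", "RESTAURANT"], DICTIONARY)
print(result[0].value)
```

Because the BACKWARD expectation of RESTAURANT is tested before the FORWARD expectation of CHINESE, the node ends up with TYPE (*CHINA*) rather than PARTOF (*CHINA*), as in the diagram above.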

The  underlying  assumptions  of  this  algorithm  are: 

(a)  People  read  noun  groups  from  left  to  right. 

(b)  People  do  not  passively  accumulate  words  until  they  decide 
that  they  have  reached  the  head  noun.  Instead,  they  make 
decisions  about  the  interpretations  and  combinations  of  words 
as  soon  as  it  becomes  possible  (i.e.  as  soon  as  an 
expectation  is  satisfied).  Thus,  in  a phrase  MEAT  SHOP 
OWNER,  MEAT  SHOP  is  interpreted  before  OWNER  is  read. 

(c) Expectations attached to words which come later in the phrase
usually are stronger than those of preceding words. In the
sequence of words of a simple noun group (like FEARLESS
CHINESE SOLDIER) words on the left are usually modifiers of
some word on the right. A modifier normally has FORWARD
expectations for a fairly large class of items it can modify.
On the other hand, it is relatively seldom that a word is
looking for a particular modifier on its left. In general,
the more specific the expectation is, the higher priority it
should have. This is what happened in our example with
CHINESE RESTAURANT.

So far, I have carefully avoided one very important problem.
My basic control algorithm does not have a STOP statement. Where
does a noun group end? This problem is discussed in the next
section.

3.  The  Problem  of  Boundaries 

One problem that any noun group processor has to solve is
the problem of boundaries. Where does a noun group end? In most
cases the answer to this question is quite simple: things like
verbs, commas, prepositions, and articles terminate most noun
groups. In practice, however, none of these indicators is very
reliable. Consider the following example that NGP had to deal
with:

(1) THE U.S. FORCES FIGHT IN VIETNAM IS HOPELESS.

This example illustrates the difficulties arising from the
ambiguity of the part of speech classification of the words
FORCES and FIGHT. When the context does not provide an early
disambiguation we have to make a guess and then later correct it
if necessary. As a first guess, NGP collects the maximum number
of elements into a noun group. Thus it includes both FORCES and
FIGHT rather than stopping after THE U.S.

(2)  BILL,  JOHN,  AND  MARY  LEFT. 

(3)  BILL  KICKED  JOHN,  AND  MARY  KICKED  BILL. 

BILL, JOHN, AND MARY in the second example constitute one
semantic unit:

(#GROUP MEMBER (#PERSON FIRSTNAME (BILL))
        MEMBER (#PERSON FIRSTNAME (JOHN))
        MEMBER (#PERSON FIRSTNAME (MARY)))

But is it reasonable to consider this phrase as a single noun
group on the surface level? Example (3) shows that JOHN, AND
MARY might be different groups. Expectations external to the noun
group must decide whether these three words can be clustered in
one group. The same is true for examples (4) and (5), where the
phrase ON THE TRAY may or may not be attached to the noun phrase
THE GLASS.

(4)  JOHN  SAW  THE  GLASS  ON  THE  TRAY. 

(5) JOHN PUT THE GLASS ON THE TRAY.

On the other hand, the preposition OF in the phrase OF STATE in
example (6)

(6)  U.S.  ASSISTANT  SECRETARY  OF  STATE  MARSHALL  GREEN 

is predicted by the noun SECRETARY, and can be interpreted by the
noun group processor without outside help. This brings in the
following principle of noun group processing:

ANY  UNEXPECTED  WORD  WHICH  IS  INCOMPATIBLE  WITH  THE 
CURRENT  NOUN  GROUP  TERMINATES  THE  GROUP  ON  THE 
PRECEDING  WORD. 

Control  is  returned  to  the  higher  level  routine  which  called  the 
noun  group  and  which  decides  how  the  group  will  be  used.  It 
might  be  attached  to  a preceding  noun  group  or  used  otherwise. 

Semantically,  a phrase  like 

(7)  A RECENT  YALE  GRADUATE,  JIM  MEEHAN,  27,  ASSISTANT  PROFESSOR 
OF  COMPUTER  SCIENCE  AT  UCI  (was  awarded  ...) 

is one PP and, therefore, should be considered one noun group.
From the processing point of view, we need a more restricted
definition of SURFACE noun groups. A SURFACE NOUN GROUP (or,
simply,  noun  group)  is  a string  of  words  which  can  be  processed 
by  NGP  without  relinquishing  control  to  the  higher  processor. 

What  are  the  rules  of  compatibility  which  determine  the 
boundaries  of  a surface  noun  group?  All  semantic  nodes  that  can 
be  used  in  a noun  group  must  belong  to  one  of  the  following 
classes:  ADJECTIVE,  ADVERB,  NOUN,  TITLE,  NAME,  NUMBER,  DETERM, 
and  BOGUS.  (This  information  is  stored  on  the  node  under  the 
property MARKER). Class BOGUS is reserved for unknown words and
will be discussed later. Class TITLE contains all the words
which can be followed by a name: professor, doctor, patrolman,
president, etc. The noun group is processed from left to right
as long as the following conditions are satisfied:

(1)  Each  word  which  is  not  specifically  expected  must  belong  to 
one  of  the  classes  mentioned  above. 

(2)  No  word  can  precede  a DETERM. 

(3)  ADJECTIVES,  ADVERBS,  and  NUMBERS  cannot  be  preceded  by  either 
NOUNS,  TITLES,  or  NAMES. 

(4) TITLES and NOUNS cannot be preceded by a NAME.

(5)  A NAME  cannot  be  immediately  preceded  by  a NOUN. 

(6)  A NAME  cannot  be  preceded  by  a DETERM. 

For example, phrase (7) will be processed as four separate noun
groups:

(a)  A RECENT  YALE  GRADUATE  - ends  with  a comma,  but  even  if  this 
comma  were  missing,  the  phrase  would  have  been  terminated  at 
the  same  place  by  NAME,  using  rules  5 and  6 

(b)  JIM  MEEHAN  - ends  with  a comma 

(c)  27  - special  case  of  a noun  group  - an  age  group 

(d) ASSISTANT PROFESSOR OF COMPUTER SCIENCE AT UCI - ends with
WAS which is a verb

Noun groups OF COMPUTER SCIENCE and AT UCI are processed without
leaving NGP since the word PROFESSOR sets up expectations for
them.
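
Compatibility rules (1) to (6) can be read as a test applied to each new word's MARKER against the markers already collected. The sketch below is one possible reading, not the original code: the function names are hypothetical, explicit expectations (such as OF after PROFESSOR) are ignored, and the markers assigned to the sample words are assumptions, in particular YALE is treated as a NOUN.

```python
ALLOWED = {"ADJECTIVE", "ADVERB", "NOUN", "TITLE", "NAME", "NUMBER", "DETERM", "BOGUS"}

def compatible(marker, seen):
    """True if a word with this MARKER may extend a group whose markers so far are `seen`."""
    before = set(seen)
    if marker not in ALLOWED:                                   # rule 1
        return False
    if marker == "DETERM" and seen:                             # rule 2
        return False
    if marker in {"ADJECTIVE", "ADVERB", "NUMBER"} and before & {"NOUN", "TITLE", "NAME"}:
        return False                                            # rule 3
    if marker in {"TITLE", "NOUN"} and "NAME" in before:        # rule 4
        return False
    if marker == "NAME" and seen and seen[-1] == "NOUN":        # rule 5
        return False
    if marker == "NAME" and "DETERM" in before:                 # rule 6
        return False
    return True

def group_length(markers):
    """Number of leading words that form one surface noun group."""
    seen = []
    for marker in markers:
        if not compatible(marker, seen):
            break
        seen.append(marker)
    return len(seen)

# A RECENT YALE GRADUATE JIM MEEHAN, with the comma missing
markers = ["DETERM", "ADJECTIVE", "NOUN", "NOUN", "NAME", "NAME"]
print(group_length(markers))   # 4: the group ends before JIM
```

With these assumed markers the group is cut before JIM by rules 5 and 6, which agrees with case (a) above.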


Rules (1) - (6) are much looser than the usual syntactic
rules for noun groups (see, for example, Winograd 1972). But our
goal is not the rejection of syntactically incorrect sentences.
We introduce restrictions only where they help, that is, where
their absence creates disambiguation or processing difficulties.

The other distinctive feature of our rules is that they are
generated dynamically and can be changed by the actions of any
expectation. This is how, for example, possessives are handled:

(8)  POLICE  CHIEF'S  NEW  CAR 

First, the node for POLICE CHIEF is built:

NGP1:
    VALUE = (#PERSON OCCUPATION (POLICE-CHIEF))
    MARKER = TITLE

Then the program sees the possession mark which satisfies a
special default expectation. The action of this expectation
transforms NGP1 into:

NGP1:
    VALUE = (#PERSON OCCUPATION (POLICE-CHIEF))
    MARKER = ADJECTIVE
    FORWARD = "If the next node is a #PHYSOBJ then make it POSSBY
              the value of NGP1 (i.e. by (#PERSON OCCUPATION
              (POLICE-CHIEF)))"


4. Putting Pieces Together


In  the  previous  section  I described  the  basic  noun  group 
processor.  Complex  noun  groups  are  broken  into  simpler  phrases 
which  are  processed  separately.  Separately,  however,  does  not 
mean  independently.  The  previously  built  part  of  the  noun  group 
can  affect  the  analysis  of  the  remaining  parts.  In  this  section 
I will  describe  the  mechanism  of  this  interaction  and  how  various 
parts  of  a noun  group  are  put  together. 

In  accordance  with  our  general  principles,  this  process  is 
driven  by  a hierarchically  organized  set  of  expectations.  There 
are two kinds of expectations: (1) those dynamically generated
by the input and (2) default expectations supplied by the control
mechanism.  These  default  expectations  are  designed  to  catch  such 
unexpected  things  as  appositives,  addresses,  age  groups,  etc. 
For  example,  when  we  hear  A CONNECTICUT  MAN  in 

(1) (The award was given to) A CONNECTICUT MAN, JOHN DOE,
AGE 23, OF 234 COLLEGE AVENUE, NEW HAVEN.

we  do  not  necessarily  immediately  expect  to  hear  his  name,  age, 
and  address,  although  we  know  that  as  a person  he  has  these 
characteristics.  These  are  secondary,  default  expectations  which 
are  tested  only  if  other,  explicit  expectations  fail.  In  the 
above  example  the  processing  goes  as  follows: 

First,  A CONNECTICUT  MAN  is  collected,  generating: 

(2) (#PERSON GENDER (MALE)
             RESIDENCE (#LOCALE STATE (*CONN*)))

At  this  point,  control  returns  to  ELI  which  tests  the 
expectations  which  were  pending  before  we  reached  this  phrase. 
One  of  these  expectations  is  satisfied  and  its  action  puts 
structure  (2)  into  the  waiting  slot  in  a larger  frame: 

((ACTOR (NIL) <=> (*ATRANS*) OBJECT (*AWARD*)
  TO (#PERSON GENDER (MALE)
              RESIDENCE (#LOCALE STATE (*CONN*)))))

The  slot  that  (2)  filled  is  remembered  in  the  variable  called 
LASTNG.  Then  comes  JOHN  DOE.  No  explicit  expectations  are 
satisfied.  The  monitor  goes  to  a special  mode  called  TRAP.  TRAP 
checks  whether  LASTNG  was  a person  and,  if  so,  checks  the  default 
expectations  about  a person.  The  NAME  expectation  is  satisfied 
and  the  specialized  action  which  collects  personal  names  is 
executed.  As  a result  name  modifiers  are  attached  to  the  male 
Connecticut  resident: 

(#PERSON GENDER (MALE)
         RESIDENCE (#LOCALE STATE (*CONN*))
         FIRSTNAME (JOHN) LASTNAME (DOE))

After this, control goes back to the top level processor. This
reads the next words, "AGE 23". Again, no expectations are
immediately satisfied and the monitor traps into the secondary
expectations. The AGE expectation is satisfied and the specialized
action which collects AGE specification groups is executed. The
result is an AGE modifier which is attached to John. OF 234
COLLEGE AVENUE also goes to TRAP, which calls the address group
processor. The final result is:

(#PERSON GENDER (MALE)
         RESIDENCE (#LOCALE STATE (*CONN*)
                            STREETNUMBER (234)
                            STREETNAME (COLLEGE AVENUE))
         FIRSTNAME (JOHN) LASTNAME (DOE))

The following example illustrates a slightly different
problem:

(3) LOUIS CAPPIELLO, YALE POLICE CHIEF

In order to figure out that being a YALE POLICE CHIEF is LOUIS
CAPPIELLO's occupation we first have to collect both noun groups.
This is done with the help of another secondary expectation
called the EXTRA-NOUNGR trap. LOUIS CAPPIELLO generates:

(#PERSON FIRSTNAME (LOUIS) LASTNAME (CAPPIELLO))

YALE POLICE CHIEF generates:

(#PERSON OCCUPATION (YALE-POLICE-CHIEF))

Then another secondary expectation tests to see if LASTNG and
EXTRANG could be the same thing. If so, the two groups are
merged.
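
The text does not say how the "same thing" test is carried out. The sketch below assumes one simple criterion, namely identical headers and no conflicting property values; the function names and the dictionary layout are hypothetical.

```python
def could_be_same(a, b):
    """Assumed test: same header and no property on which the two groups disagree."""
    if a["header"] != b["header"]:
        return False
    return all(a[key] == b[key] for key in a.keys() & b.keys())

def merge(lastng, extrang):
    """Return the merged structure, or None if the groups cannot be the same thing."""
    if not could_be_same(lastng, extrang):
        return None
    return {**lastng, **extrang}

lastng = {"header": "#PERSON", "FIRSTNAME": "LOUIS", "LASTNAME": "CAPPIELLO"}
extrang = {"header": "#PERSON", "OCCUPATION": "YALE-POLICE-CHIEF"}
merged = merge(lastng, extrang)
```

Under this assumption a #PERSON and a #LOCALE are never merged, so an address following a name would be left for a different expectation.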

Appositives can be arbitrarily complex: from simple name
groups to complicated prepositional phrases and relative clauses.
Very rarely are they explicitly expected. They are handled by
the secondary expectations based on the general properties of
things and the knowledge about the ways these things can be
expressed in English. TRAP represents an attempt to implement
the mechanism controlling the interaction between these
expectations.

TRAP  is  still  in  the  experimental  stage  of  development.  Its 
flow  of  control  is  rather  complex.  In  general,  first,  it  tries 
to  find  and  test  expectations  about  general  properties  of  the 
item  in  LASTNG.  For  example,  for  a person  it  tries  to  collect 
special  modifiers  such  as  name,  age,  and  address.  If  all  these 
expectations  fail,  TRAP  checks  for  possible  appositives  such  as 
simple  EXTRA  noun  groups,  prepositional  phrases,  or  relative 
subclauses.  If  one  of  these  appositives  is  collected,  TRAP  first 
checks  the  explicit  expectations  which  may  have  been  pending  (for 
example,  a WHICH-clause  might  want  to  be  attached  to  a particular 
physical object) and then checks the secondary expectations
again. This time, it may catch some properties which it missed
the first time because they were encoded in a more complicated
form.  In  order  to  clarify  this  description  let  us  follow  a few 
more  examples: 

(3)  JOHN  DOE  OF  GENERAL  MOTORS 

The subgroup OF GENERAL MOTORS is caught by TRAP's prepositional
phrase expectation. Since there are no specific expectations
which can link JOHN DOE and GENERAL MOTORS, the default one,
attached to OF, is checked. Its action links the two groups as
follows:

(#PERSON FIRSTNAME (JOHN) LASTNAME (DOE)
         SOMEREL (#ORGANIZATION ORGNAME (GENERAL-MOTORS)))

SOMEREL  means  that  we  do  not  really  know  the  exact  nature  of  the 
relations  between  JOHN  DOE  and  GENERAL  MOTORS. 

In  the  following  example 

US  NAVY  TASK  FORCE  WHICH  HAS  BEEN  ON  PATROL  DUTY  IN  THE  INDIAN 
OCEAN  (left  the  area). 

the WHICH clause is collected by TRAP's subclause expectation and
is attached to US NAVY TASK FORCE by an expectation associated
with WHICH. The result is:

(#GR-ORG PARTOF (#ORGANIZATION BRANCH (NAVY)
                               PARTOF (*USA*))
         REL ((ACTOR MODFOCUS
               <=> ($PATROL PLACE (*INDIAN-OCEAN*)))))

Subclause  processing  represents  a difficult  problem  on  its 
own.  The  problem  of  subclause  boundaries,  for  example,  is  as 
complex  as  that  of  noun  groups.  In  solving  it,  I used  the  same 
philosophy as for noun group boundaries: the current subclause
is  finished  when  the  next  word  is  not  expected  by  any 
expectations  from  that  subclause. 

The  traditional  stumbling  block  of  all  parsers  - AND 
conjunction  - is  also  handled  by  a series  of  TRAP  expectations. 
Although in difficult cases we cannot avoid backtracking, simple
cases like

(5) JOHN AND MARY ATE SOUP AND LASAGNA AND LEFT.

can be processed by the program with the help of the following
heuristics. If AND is not specifically expected and occurs in
the sentence between two noun groups which can be combined into one
semantic unit, then it is interpreted as a link between the two
noun groups. Otherwise, if AND occurs in the sentence after the
verb, it is interpreted as a link between two clauses.
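
The two heuristics can be written as a small decision function. The
sketch below is an assumption-laden illustration: the function name,
its boolean parameters, and the returned labels are hypothetical, and
the tests that would compute those booleans in a real parser are left
out.

```python
from typing import Optional


def interpret_and(*, and_expected: bool, left_is_noun_group: bool,
                  right_is_noun_group: bool, combinable: bool,
                  verb_seen: bool) -> Optional[str]:
    """Apply the AND heuristics in the order given in the text."""
    if and_expected:
        # Some active expectation asked for AND; the heuristics do not apply.
        return "handled-by-expectation"
    if left_is_noun_group and right_is_noun_group and combinable:
        return "noun-group-link"
    if verb_seen:
        return "clause-link"
    return None  # a difficult case; backtracking may be needed


# The three ANDs of sentence (5), with hand-set feature values.
links = [
    # JOHN AND MARY
    interpret_and(and_expected=False, left_is_noun_group=True,
                  right_is_noun_group=True, combinable=True, verb_seen=False),
    # SOUP AND LASAGNA
    interpret_and(and_expected=False, left_is_noun_group=True,
                  right_is_noun_group=True, combinable=True, verb_seen=True),
    # LASAGNA AND LEFT
    interpret_and(and_expected=False, left_is_noun_group=True,
                  right_is_noun_group=False, combinable=False, verb_seen=True),
]
```

The first two conjunctions come out as links between noun groups and
the third as a link between clauses.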

All  examples  presented  so  far  deal  with  noun  groups 
describing  Picture  Producers.  The  next  example  shows  how  Concept 
Producers  are  handled. 

(6) (Castro condemned) THE EXECUTION OF THOUSANDS OF COMMUNISTS
IN INDONESIA.


THE EXECUTION refers to the script $EXECUTION. This script has
among its roles the VICTIM of the execution. Among the
expectations associated with the script there is one which
expects the victim to be a person (or a group of people)
introduced by the preposition OF. Hearing the word EXECUTION
sets up an expectation for the word OF (someone). THOUSANDS OF
is another unit which creates a group whose members follow. This
expectation is satisfied by COMMUNISTS. When IN INDONESIA comes
it is not expected by anybody. Hence, the noun group collection
is suspended and THE EXECUTION which is now transformed into:

[$EXECUTION VICTIM (#GROUP MEMBER (#PERSON
                                   OCCUPATION (COMMUNIST)
                           COMPNUM (ORDER VAL (1000]

is placed in the MOBJECT slot of MTRANS for "condemned". After
this, IN INDONESIA is collected:

(LOC VAL (*INSIDE* PARTOF (*INDONESIA*)))

Now the processor must decide whether Indonesia was the place
where the execution occurred or where it was condemned by Castro.
In the absence of other expectations, the program picks the first
alternative.

To  conclude  this  section,  I would  like  to  discuss  the 
treatment  of  words  unknown  to  the  program.  People  have  a limited 
ability  to  interpret  such  words  out  of  context,  or,  at  least,  to 
ignore  them.  We  tried  to  put  some  of  this  kind  of  intelligence 
in  our  programs.  The  problem  has  two  aspects.  First,  we  have  to 
figure  out  what  role  the  unknown  word  (or  words)  might  play  in 
the  sentence  and  then  interrogate  the  context  to  find  out  what 
meaning  this  word  might  have.  The  borderline  between  these  two 
tasks  is  very  vague.  As  of  now,  most  of  the  first  part  is 
handled  by  NGP  and  most  of  the  second  part  by  Rick  Granger's 
program  called  FOUL-UP  (Granger  1977).  The  following  examples 
illustrate  how  the  NGP  part  works. 

(7)  JOHN  ATE  A FOO  FISH. 

FOO  is  interpreted  as  an  unknown  modifier  and  ignored. 

(8)  JOHN  ATE  A BLUE  FOO. 


The output of NGP

(#BOGUS COLOUR (BLUE) LEXVAL (FOO) REF (INDEF))

is handed to FOUL-UP for further investigation.

(9)  DR  FOO  BAZ  ATE  A BLUE  FISH. 

FOO  BAZ  are  interpreted  as  the  first  and  the  last  names  of  a 
person  whose  occupation  is  DOCTOR. 

(10) FOO'S FISH WAS BAD.

FOO  is  interpreted  as  the  last  name  unless  (9)  and  (10)  occurred 
in  the  same  story,  in  which  case  FOO  would  have  already  been 
defined  as  a first  name. 

(11)  JOHN  WAS  TAKEN  TO  THE  HOSPITAL  BY  FOO  AMBULANCE. 

FOO  is  interpreted  to  be  a name  of  an  ambulance  company,  since 
AMBULANCE  has  a BACKWARD  expectation  looking  for  a company  name. 

(12)  593  FOO  BAZ  AVENUE 

FOO  BAZ  is  interpreted  as  the  name  of  an  avenue. 
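
The role guesses in examples (7) to (12) depend only on the
neighbours of the unknown words, which the following Python sketch
makes explicit. Everything in it is hypothetical: the tables, the
role labels, and the function guess_role are assumptions for
illustration, and the story-level memory mentioned for (10) is not
modelled.

```python
from typing import Optional

TITLES = {"DR": "DOCTOR"}
# Known nouns whose entries carry a BACKWARD expectation for a name.
BACKWARD_NAME = {"AMBULANCE": "company-name", "AVENUE": "avenue-name"}
KNOWN_NOUNS = {"FISH"} | set(BACKWARD_NAME)


def guess_role(prev: Optional[str], nxt: Optional[str]) -> str:
    """Guess the role of a run of unknown words from its neighbours.

    prev is the word before the run and nxt the word after it;
    None stands for the start or the end of the sentence.
    """
    if prev in TITLES:
        return "person-name"        # (9)  DR FOO BAZ
    if nxt == "'S":
        return "last-name"          # (10) FOO'S
    if nxt in BACKWARD_NAME:
        return BACKWARD_NAME[nxt]   # (11) FOO AMBULANCE, (12) FOO BAZ AVENUE
    if nxt in KNOWN_NOUNS:
        return "unknown-modifier"   # (7)  A FOO FISH: ignored
    return "bogus-head"             # (8)  A BLUE FOO: passed on as #BOGUS


roles = {
    7: guess_role("A", "FISH"),
    8: guess_role("BLUE", None),
    9: guess_role("DR", "ATE"),
    10: guess_role(None, "'S"),
    11: guess_role("BY", "AMBULANCE"),
    12: guess_role("593", "AVENUE"),
}
```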

5. Memory and Language Processing

In this section we would like to discuss some general
problems of memory and understanding as related to one very
practical task. Originally the idea to write a noun group parser
appeared in connection with our preliminary work on the WEIS
project. As mentioned in the introduction, WEIS is a system
designed to understand and classify isolated newspaper headlines
on international relations. The classifications of headlines
about international interactions are triples: ACTOR (country),
ACTION (one of ?0 selected international interactions), and
TARGET (country). The list of 50 headlines that our system can
handle (as of March 1, 1977) is given in Appendix 1.

An earlier attempt to obtain such a classification directly
from the input text using the "simplest possible syntax relative
to the ACTOR-ACTION-TARGET semantics" failed dramatically
(Tripodes et al. 1979). It was clear to us from the beginning
that in order to correctly encode a sentence one has to
understand it. Further, one cannot speak about understanding
without meaning representation and memory models. Conceptual
Dependency was our natural choice for a meaning representation
system. As for the memory, we thought that a very limited model
containing only basic information about countries, people,
physical objects, and some organizations would be sufficient for
the task. This model proved to be inadequate. To determine the
meaning of even simple sentences we need much more detailed
knowledge about current and past relations between countries,
their size, policies, and many other features. Consider the
following examples:


(1) LEBANESE OFFICIALS SEIZED 1500 RIFLES FROM BULGARIA

(Some of the examples in this section might look cumbersome or
artificial, but, in fact, all of them are real newspaper
headlines which WEIS had to process and classify.)

Were the rifles owned by Bulgaria, made in Bulgaria, or did they
come from Bulgaria? A reader who follows international relations would
know that it is highly implausible that the Lebanese officials
would enter into direct conflict with Bulgaria by seizing its
property. An informed reader would also know that there are
armed groups in Lebanon who receive supplies from Communist
countries. Thus, he would conclude that the rifles probably came
from Bulgaria and were seized from an unknown party. The meaning
of (1) can be expressed in CD using script notation, as follows:


(ACTOR (#GR-ORG MEMBER
                (#PERSON PARTOF
                         (#ORGANIZATION TYPE (GOVERNMENT)
                                        PARTOF (LEBANON))))
 <=> ($SEIZE)
 FROM X
 OBJECT (#GROUP MEMBER Y))

where Y is

(#PHYSOBJ TYPE (WEAPON) COMPNUM (1500)
          REL (ACTOR (SOMEONE) <=> (ATRANS) OBJECT Y
               FROM (BULGARIA) TO X))

If, on the other hand, the headline had been ISRAEL SEIZED RIFLES
FROM EGYPT, with the two countries engaged in a direct conflict,
then a knowledgeable reader would have probably concluded that the
rifles were seized from Egypt. Here different interpretations
lead to different WEIS encodings with different TARGETS for the
ACTIONS.
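
The dependence of the encoding on memory can be made concrete with
a small Python sketch. It is a toy under stated assumptions: the
Encoding triple, the IN_DIRECT_CONFLICT table, and the function
encode_seize_from are hypothetical, and a single conflict lookup
stands in for the much richer inference described above.

```python
from typing import NamedTuple, Optional


class Encoding(NamedTuple):
    """A WEIS-style triple; target is None when the party is unknown."""
    actor: str
    action: str
    target: Optional[str]


# Toy memory (an assumption): unordered pairs of countries in direct conflict.
IN_DIRECT_CONFLICT = {frozenset({"ISRAEL", "EGYPT"})}


def encode_seize_from(actor: str, from_country: str) -> Encoding:
    """Choose the TARGET of 'X SEIZED ... FROM Y' by consulting memory."""
    if frozenset({actor, from_country}) in IN_DIRECT_CONFLICT:
        # FROM links the verb to its indirect object: Y is the target.
        return Encoding(actor, "SEIZE", from_country)
    # FROM qualifies the noun group (origin of the goods): target unknown.
    return Encoding(actor, "SEIZE", None)


lebanon = encode_seize_from("LEBANON", "BULGARIA")
israel = encode_seize_from("ISRAEL", "EGYPT")
```

The same surface pattern thus yields an unknown TARGET for the
Lebanese headline and EGYPT for the Israeli one.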


The difficulty in the above examples comes from the
ambiguity of the word FROM. It can be a link between the verb
SEIZE and its indirect object or it can link a qualifier to a
noun group. In general, prepositions help us to identify the
roles played by the words they precede, but very often they
are not sufficient. Consider the preposition BY in the following
sentence:

(2) USA PROTESTS INDIA'S ABANDONMENT OF NEUTRALITY BY
ESTABLISHING FULL DIPLOMATIC RELATIONS WITH NORTH VIETNAM

Even after we have established that BY introduces the instrument
of an action (which in itself is a nontrivial task), we still do
not know which action this instrument modifies. Who established
full diplomatic relations with North Vietnam, the USA or India?
One has to be acquainted with the corresponding political
situation in order to reject the first interpretation by making
the following inference: the USA was not likely to establish
full diplomatic relations with North Vietnam, but India was, and
such an act would, in fact, be a violation of neutrality from the
US viewpoint.

The  preposition  IN  is  even  more  troublesome: 

(3)  CASTRO  CONDEMNED  THE  EXECUTION  OF  COMMUNISTS  IN  INDONESIA 

Here again an informed reader would know that Castro was not
likely to pronounce his condemnations in Indonesia and, hence, would
conclude that it was the execution of communists which took place
in that country.

Another difficulty is the scope of the prepositions.
Consider:

(4) SOUTH KOREA SMASHES 7 NORTH KOREAN ESPIONAGE RINGS INVOLVING
9 SPIES AND 14 COLLABORATORS IN SEOUL, TAEGU, AND POHANG

For  some  reason  we  merge  9 spies  and  14  collaborators  in  a group 
of  23  individuals,  which  is  split  into  7 groups  that  are 
distributed  in  three  South  Korean  cities.  If,  instead  of  9 SPIES 
we  had  9 COMMUNICATION  SATELLITES,  then  we  would  have  placed  only 
the  14  collaborators  in  these  cities,  keeping  the  location  of  the 
satellites  unspecified. 

Semantic ambiguity does not have to be related to any
particular  preposition.  Consider  the  following  example: 

(5)  CAMBODIA  HOSTS  USA  ASSISTANT  SECRETARY  OF  STATE  FOR  A 
BRIEFING  ON  THE  OUTCOME  OF  US  PRESIDENT  NIXON  VISIT  TO  CHINA 

Who  was  briefing  whom?  Our  knowledge  of  the  international 
situation  at  the  time  of  Nixon's  first  visit  to  China  tells  us 
that  it  was  the  USA  who  was  briefing  Cambodia. 

Even  a relatively  simple  noun  phrase  such  as  RUSSIAN  RADAR 
INSTALLATION  in 

(6)  ISRAEL  SEIZES  A RUSSIAN  RADAR  INSTALLATION  IN  EGYPT 

can be a source of a mistake in encoding. Only the knowledge of
the precise nature of the relations between Israel, Egypt, and
the USSR allows the reader to conclude that the radar in question
was made in rather than possessed by the USSR, and that the
TARGET of the Israeli ACTION was Egypt rather than Russia.

The correct understanding and classification of the above
examples requires very detailed knowledge of international
relations. And these were rather simple sentences whose meaning
seems obvious to most people. Many real newspaper headlines are
much more puzzling:

(7)  JORDAN  SAID  THE  ARABS  FAILED  THE  TEST 

(8)  FORD  TO  NEW  YORK:  DROP  DEAD 

Suppose  now  that  we  have  a detailed  model  of  the  political 
world  which  enables  us  to  make  all  necessary  inferences  about 
international  affairs.  Will  such  a model  be  sufficient  for  the 
correct  understanding  of  political  headlines?  On  the  surface  the 
answer  is  yes.  With  such  a model  we  would  be  able  to  make  all 
the  inferences  we  needed  in  our  analysis  of  the  examples  in  this 
section.  But  note  that  in  our  discussion  of  these  examples  we 
only listed the necessary inferences. We said nothing about how
we arrived at the necessity to use these particular inferences.
In other words, the memory itself is not enough. We need to know
how to get the parser to ask the memory the right questions.
This paper describes in detail how such questions are treated
inside English noun groups. Most of the examples in this section
go beyond the noun group framework. We were able to handle them
by attaching ad hoc requests to the secondary expectations
defining instrumental and locative prepositions. This is not
always a satisfactory solution, and finding a general solution to
this problem is one of the areas of our current research.

6.  Comparison  with  other  Works  and  Conclusions 

The work presented in this paper is a further development
of ELI. The main difference between this program and most other
parsers (see, for example, Winograd 1972, Woods and Kaplan 1971)
is that it does not separate its linguistic knowledge from its
general world knowledge. In other programs the analysis is done
in two stages. First the input is analysed syntactically and
then the result is interpreted semantically. For example, LUNAR
(Woods and Kaplan 1971) uses the Augmented Transition Network
Grammar (Woods 1970) to generate possible syntactic
interpretations of a given sentence and then applies its domain
knowledge to determine whether the interpretation is meaningful.
Thus, noun groups are parsed purely syntactically and their
meaning is not established until the whole sentence is parsed.
In each noun group the first noun is assumed to be the head noun.
If later this turns out to be incorrect, the system backs up and
tries to accumulate more elements into the noun group. For
example, the correct processing of the phrase PRESIDENT JIMMY
CARTER, which contains three nouns, will require LUNAR to back up
twice. This means that a great deal of unnecessary effort is
spent in finding syntactically plausible but meaningless parses.
This is especially true when one tries to relax some syntactic
rules to allow for slightly incorrect sentences. In NGP the
parsing is done with the use of the rules most appropriate in a
given situation, semantic and syntactic. Thus, in the example above,
the programs contained in the dictionary entry for the word
PRESIDENT will immediately collect JIMMY CARTER. Most of the
program's linguistic knowledge is not built into its control
structure but stored in the dictionaries and used as a part of
its general knowledge. This makes the program very flexible,
easily extendable, and provides for the correct processing of
"ungrammatical" sentences.

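The idea of keeping the collecting program in the dictionary
entry, rather than in the control structure, can be sketched in
Python as follows. This is a hypothetical miniature: the names
DICTIONARY, president_entry, and parse_noun_group, the toy list of
first names, and the rule that a last name follows a known first
name are all assumptions for the example and do not reproduce the
actual ELI dictionary format.

```python
from typing import Callable, Dict, List, Tuple

Concept = Dict[str, str]
# An entry program receives the words after its own word and returns
# the concept it built plus the number of following words it consumed.
EntryProgram = Callable[[List[str]], Tuple[Concept, int]]

KNOWN_FIRST_NAMES = {"JIMMY"}  # toy data, assumed for the example


def president_entry(rest: List[str]) -> Tuple[Concept, int]:
    """Entry for PRESIDENT: expect an optional first and last name."""
    concept: Concept = {"class": "#PERSON", "OCCUPATION": "PRESIDENT"}
    used = 0
    if rest and rest[0] in KNOWN_FIRST_NAMES:
        concept["FIRSTNAME"] = rest[0]
        used = 1
        if len(rest) > 1:
            # Toy rule: the word after a first name is taken as the last name.
            concept["LASTNAME"] = rest[1]
            used = 2
    return concept, used


DICTIONARY: Dict[str, EntryProgram] = {"PRESIDENT": president_entry}


def parse_noun_group(words: List[str]) -> Tuple[Concept, List[str]]:
    """Run the entry program of the first word; no backing up is needed."""
    concept, used = DICTIONARY[words[0]](words[1:])
    return concept, words[1 + used:]


concept, remaining = parse_noun_group(
    "PRESIDENT JIMMY CARTER VISITED CHINA".split()
)
```

In this sketch the three nouns are consumed in one left-to-right
pass, and extending the parser means adding entries to DICTIONARY
rather than changing parse_noun_group.
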
Another important difference between this program and both
Winograd's and the LUNAR systems is in the representation of
meaning. The meaning of a sentence in Winograd's system is a
program for manipulating blocks. The meaning of a sentence in
the LUNAR system is a request for information about some
properties of the rocks from the Moon. Both these systems are
very specialized and not easily extendable to other domains. Our
analyzer is based on the Conceptual Dependency representation
system which is not limited to any particular domain. The same
program handles a wide variety of topics from car accident reports
to state visits to China.


The
possibility and the advantages of the simultaneous application of
both kinds of knowledge, without separating the process of
understanding into syntactic and semantic stages. The program
provides an intuitively plausible model for a human

Appendix 1

Sentences  processed  by  the  YALE-WEIS  program 

1.  LAO  FORCES  ABANDON  BAN-NHIL  TO  NORTH  VIETNAM. 

2.  USA  NAVY  TASK  FORCE  WHICH  HAS  BEEN  ON  PATROL  DUTY  IN  THE 
INDIAN  OCEAN  FOR  A MONTH  LEAVES  THE  AREA. 

3.  CUBA  GRANTS  ASYLUM  TO  A USA  MARINE. 

4.  FRANCE  SELLS  50  MIRAGE  JET  PLANES  TO  LIBYA. 

5.  USA  APPOLLO  12  ASTRONAUTS  VISIT  INDONESIA. 

6. ISRAELI TASK FORCE SEIZES UAR RADAR INSTALLATION ON SHADWAN.

7.  LEBANESE  OFFICIALS  SEIZED  1500  RIFLES  FROM  BULGARIA. 

8.  GUINEA  EXPELS  1 SPANISH  CITIZEN. 

9. AUSTRIA EXPELLED 4 CHINESE IN A CONTROVERSY OVER THEIR STATUS
AND  ACTIVITIES. 

10.  CASTRO  CONDEMNED  THE  EXECUTION  OF  THOUSANDS  OF  COMMUNISTS  IN 
INDONESIA. 

11.  SUKARNO  EXPLAINED  THE  EXPULSION  OF  THE  USA  NEWSMEN. 

12.  JORDAN  SAID  ARABS  FAILED  THE  TEST. 

13. ALGERIA PROTESTED TO SPAIN THE DETENTION OF AN ALGERIAN
DIPLOMAT  IN  CONNECTION  WITH  MURDER  OF  AN  OPPOSITION  LEADER. 

14. USA CONCEDE THAT USA AIR UNITS MIGHT HAVE HIT A CAMBODIAN
VILLAGE. 

15. PRIME MINISTER WILSON SENT A NOTE CONCERNING THE VIETNAM WAR
TO  PREMIER  KOSYGIN. 

16.  VATICAN  PRAISED  UNITED  KINGDOM  EFFORTS  TOWARD  PEACE  IN 
VIETNAM. 

17. PRESIDENT JOHNSON SENT CONGRATULATORY MESSAGE TO PRIME
MINISTER  INDIRA  GANDHI. 

18.  THAILAND  SAYS  IT  WILL  SOON  SEND  1000  TROOPS  TO  VIETNAM. 

19.  USA  PRESIDENT  PROMISED  ISRAELI  PRIME  MINISTER  HE  WOULD  GIVE 
CONSIDERATION  TO  ISRAELI  REQUESTS  FOR  ARMS. 

20.  SPAIN  GIVES  BACK  TERRITORY  OF  IFNI  TO  MOROCCO. 

21. SPAIN GIVES POSSESSIONS OF HISTORIAN GARCILASO TO PERU.

22. UAR FORCES ARE BOLSTERED BY KUWAIT.

23. US PRESIDENT ANNOUNCED THAT AUSTRIAN CHANCELLOR ACCEPTED US
INVITATION  TO  VISIT  THE  USA. 

24.  USA  GENERAL  SAYS  NORTH  VIETNAM  HAS  UPHELD  THE  BOMBING 
AGREEMENT. 

25.  SPAIN  AND  RUMANIA  SIGNED  AGREEMENT  ESTABLISHING  FULL  CONSULAR 
AND  COMMERCIAL  RELATIONS. 

26. KENIA SIGNS INTERNATIONAL COFFEE AGREEMENT OF 1962.

27.  USA,  UNITED  KINGDOM,  NETHERLAND,  NORWAY  ASSIGNED  WARSHIPS  TO 
NEW  PERMANENT  FORCE  OF  NATO. 

28.  FUNERAL  OF  INDIA’S  SHASTRI  ATTENDED  BY  USSR  KOSYGIN,  USA 
HUMPHREY,  UNITED  KINGDOM'S  BROWN,  AFGANISTAN'S  MAIMANDA, 
PAKISTAN'S  FARUQUE,  AND  REPRESENTATIVE  OF  U THANT. 

29.  CAMBODIA  HOSTS  USA  ASSISTANT  SECRETARY  OF  STATE  GREEN  FOR  A 
BRIEFING  ON  THE  OUTCOME  OF  US  PRESIDENT  NIXON  VISIT  TO  CHINA. 

30.  SOUTH  VIETNAMESE  FOREIGN  MINISTER  TRAM  VAN  LAM  SAYS  THE  SOUTH 
VIETNAMESE  GOVERNMENT  APPROVES  THE  FINAL  USA  = CHINA 
COMMUNIQUE  AND  FEELS  IT  UPHOLDS  THE  USA  COMMITMENTS  TO  SOUTH 
VIETNAM. 

31. US ASSISTANT SECRETARY FOR EAST ASIAN AFFAIRS MARSHALL GREEN
REAFFIRMS USA DEFENSE COMMITMENT TO TAIWAN AND SAYS THE USA
WILL CONTINUE DIPLOMATIC RELATIONS WITH THE TAIWANESE
GOVERNMENT.

32. NORTH VIETNAM TO ESTABLISH FULL DIPLOMATIC RELATIONS WITH
SWEDEN. 

33. USA PROTESTS INDIA'S ABANDONMENT OF NEUTRALITY BY
ESTABLISHING FULL DIPLOMATIC RELATIONS WITH NORTH VIETNAM.

34. TAIWAN AND UN SIGNED AGREEMENT TO BUILD TYPHOON AND FLOOD
WARNING  SYSTEM. 

35. CHINA EXPELLED ITALIAN MISSION BECAUSE TRIP BLESSED BY POPE.

36. WEST GERMANY CAUGHT 5 SOVIET CITIZENS SPYING ON WEST GERMANY.

38.  SYRIA  AND  ISRAEL  EXCHANGE  FIRE. 

39.  NORTH  VIETNAM  ASKED  THE  USCP  AN’  ' NA  T N"IN!:E  AID  T”  E:T:'. 
COUNTRY. 

40.  THAI  MILITARY  SOURCES  ACC'.C.K’'  A'-"  A 'E  EIHIN:.  ON  C’HAT 
TERRITORY. 

41.  WEST  GERMANY  REJECTS  USSR  ''RITl'I'’-'  NAT-  AN-  'Vf- R."  ,

42. CZECHOSLOVAKIA REFUSES TO LET USA STUDENTS ENTER
CZECHOSLOVAKIA. 

43.  USSR  CANCELS  INDONESIAN  FOREIGN  MINISTER  VISIT  TO  MOSCOW. 

44.  USA  PRESIDENT  SIGNED  EXECUTIVE  ORDER  TO  CUT  OFF  TRADE  WITH 
RHODESIA. 

45.  SOUTH  KOREA  SMASHES  7 NORTH  KOREAN  ESPIONAGE  RINGS  INVOLVING 
9 SPIES  AND  14  COLLABORATORS  IN  SEOUL,  TAEGU,  AND  THE  EASTERN 
PORT  OF  POHANG. 

46.  HONDURAS  SAID  IT  HAD  EXPELLED  SOME  SALVADORIANS  FOR  ILLEGAL 
IMMIGRATION.

47.  CHINA  DEMONSTRATES  IN  PEKING  AT  USSR  EMBASSY. 

48.  USSR  MILITARY  UNITS  PARTICIPATE  IN  MONGOLIAN  PARADE. 

49.  NIGERIA  TAKES  BIAFRAN  PROVISIONAL  CAPITAL  OF  OWERRI. 


50.  LEBANESE  TOWNSPEOPLE  SET  FIRE  TO  ARAB  COMMANDO  OFFICE. 


